AITopics | implement gradient descent

Provable optimal transport with transformers: The essence of depth and prompt engineering

Daneshmand, Hadi

arXiv.org Machine Learning, Nov-1-2024

Can we establish provable performance guarantees for transformers? Establishing such theoretical guarantees is a milestone in developing trustworthy generative AI. In this paper, we take a step toward addressing this question by focusing on optimal transport, a fundamental problem at the intersection of combinatorial and continuous optimization. Leveraging the computational power of attention layers, we prove that a transformer with fixed parameters can effectively solve the optimal transport problem in Wasserstein-2 with entropic regularization for an arbitrary number of points. Consequently, the transformer can sort lists of arbitrary sizes up to an approximation factor. Our results rely on an engineered prompt that enables the transformer to implement gradient descent with adaptive stepsizes on the dual optimal transport. Combining the convergence analysis of gradient descent with Sinkhorn dynamics, we establish an explicit approximation bound for optimal transport with transformers, which improves as depth increases. Our findings provide novel insights into the essence of prompt engineering and depth for solving optimal transport. In particular, prompt engineering boosts the algorithmic expressivity of transformers, allowing them to implement an optimization method. With increasing depth, transformers can simulate several iterations of gradient descent.
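To make the link between entropic optimal transport and sorting concrete, here is a minimal sketch of the classical Sinkhorn iteration on a tiny one-dimensional problem. It is not the transformer construction from the paper, and it does not use the paper's adaptive-stepsize gradient descent; it only illustrates the underlying problem. The function name `sinkhorn_plan`, the regularization strength `epsilon`, the iteration count, and the sample points are all assumptions chosen for illustration.

```python
import math


def sinkhorn_plan(x, y, epsilon=0.01, iterations=200):
    """Approximate the entropic optimal transport plan between two
    equally sized point sets with uniform weights and squared-distance cost.
    epsilon and iterations are illustrative choices, not values from the paper."""
    n = len(x)
    marginal = 1.0 / n
    # Gibbs kernel K_ij = exp(-cost_ij / epsilon)
    kernel = [[math.exp(-((xi - yj) ** 2) / epsilon) for yj in y] for xi in x]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(iterations):
        # Rescale rows, then columns, so the plan matches both marginals
        u = [marginal / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [marginal / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(n)]


# Transporting unsorted points onto an increasing grid recovers their ranks.
points = [0.9, 0.1, 0.5]
grid = [0.0, 0.5, 1.0]
plan = sinkhorn_plan(points, grid)
ranks = [max(range(len(grid)), key=lambda j: row[j]) for row in plan]
```

In this sketch, `ranks` holds, for each input point, the index of the grid position that receives most of its mass, which is the point's position in sorted order when the regularization is small.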

gradient descent, optimal transport, transformer, (11 more...)

arXiv: 2410.19931

Country:

North America > United States > Ohio (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

How to implement Gradient Descent in Python

#artificialintelligence, Dec-9-2019, 20:47:12 GMT

We will build a single-neuron network that predicts admission to a graduate school. The data is shared on Google Drive in the original post, which also shows its first five rows. The first column, admit, indicates whether the student was admitted and is the target for our model; the second and third columns, gre and gpa, are numerical features for the student; the fourth column, rank, is a categorical feature. We will apply one-hot encoding to the categorical feature to add dummy columns.
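A minimal sketch of the setup described above, assuming a sigmoid neuron trained with batch gradient descent on the cross-entropy loss. The rows below are made-up placeholder values, not rows from the shared dataset. The assumption that rank takes values 1 to 4, the scaling of gre and gpa, and all function names are illustrative choices rather than the original post's code.

```python
import math


def one_hot(rank, num_ranks=4):
    """Turn a categorical rank (assumed to be 1..num_ranks) into dummy columns."""
    return [1.0 if rank == r else 0.0 for r in range(1, num_ranks + 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of the single neuron: sigmoid of a weighted sum."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cross_entropy(weights, bias, features, labels):
    """Mean binary cross-entropy loss over the dataset."""
    total = 0.0
    for x, y in zip(features, labels):
        p = predict(weights, bias, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def train(features, labels, learning_rate=0.5, epochs=500):
    """Batch gradient descent on the mean cross-entropy loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y  # dLoss/dz for sigmoid + cross-entropy
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Placeholder rows in the column order (admit, gre, gpa, rank); values are invented.
rows = [
    (1, 760, 3.9, 1), (1, 700, 3.7, 2), (0, 520, 3.0, 3),
    (0, 480, 2.8, 4), (1, 640, 3.6, 1), (0, 560, 3.1, 2),
]
labels = [admit for admit, _, _, _ in rows]
# Scale gre and gpa to roughly [0, 1] (an assumption) and append the dummy columns.
features = [[gre / 800.0, gpa / 4.0] + one_hot(rank) for _, gre, gpa, rank in rows]

initial_loss = cross_entropy([0.0] * len(features[0]), 0.0, features, labels)
weights, bias = train(features, labels)
final_loss = cross_entropy(weights, bias, features, labels)
```

Comparing `final_loss` with `initial_loss` is a quick check that the updates are moving the weights in a direction that lowers the loss.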

cross-entropy loss function, implement gradient descent, loss function, (6 more...)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Machine Learning for Humans, Part 2.1: Supervised Learning

#artificialintelligence, Mar-1-2019, 08:37:48 GMT

The goal of gradient descent is to find the minimum of our model's loss function by iteratively getting a better and better approximation of it. Imagine yourself walking through a valley with a blindfold on. Your goal is to find the bottom of the valley. How would you do it? A reasonable approach would be to touch the ground around you and move in whichever direction the ground is sloping down most steeply.
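The valley analogy can be sketched in a few lines, assuming a one-dimensional "valley" given by the hypothetical loss f(x) = (x - 3)^2. The function names, the learning rate, and the step count are illustrative assumptions; the slope plays the role of feeling which way the ground tilts.

```python
def loss(x):
    """A simple valley whose lowest point is at x = 3 (illustrative choice)."""
    return (x - 3.0) ** 2


def slope(x):
    """Derivative of the loss: positive means the ground rises to the right."""
    return 2.0 * (x - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly take a small step in the downhill direction."""
    x = start
    for _ in range(steps):
        x -= learning_rate * slope(x)
    return x


minimum = gradient_descent(start=0.0)
```

Each step shrinks the distance to the bottom of the valley by a constant factor here, so `minimum` ends up very close to 3. A learning rate that is too large would overshoot instead, which is why the step size is kept small.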

artificial intelligence, machine learning, training data, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
